Is it possible to maximize a monotone submodular function faster than thewidely used lazy greedy algorithm (also known as accelerated greedy), both intheory and practice? In this paper, we develop the first linear-time algorithmfor maximizing a general monotone submodular function subject to a cardinalityconstraint. We show that our randomized algorithm, STOCHASTIC-GREEDY, canachieve a $(1-1/e-\varepsilon)$ approximation guarantee, in expectation, to theoptimum solution in time linear in the size of the data and independent of thecardinality constraint. We empirically demonstrate the effectiveness of ouralgorithm on submodular functions arising in data summarization, includingtraining large-scale kernel methods, exemplar-based clustering, and sensorplacement. We observe that STOCHASTIC-GREEDY practically achieves the sameutility value as lazy greedy but runs much faster. More surprisingly, weobserve that in many practical scenarios STOCHASTIC-GREEDY does not evaluatethe whole fraction of data points even once and still achievesindistinguishable results compared to lazy greedy.
展开▼